Abstract


We  apply  the  methods  of  random  matrix  theory  to  search  for  relationships  between  National 
Procurement  spending  and  the  performance  of  the  CC130  fleet.  By  understanding  the 
eigenvalue  spectrum  of  correlation  matrices  connected  to  performance  and  spending,  we 
construct  the  minimal  spanning  tree  of  the  system  to  identify  networked  hierarchies  in 
the  data.  We  find  that  no  meaningful  relationship  exists  between  spending  and  high  level 
performance  indicators,  suggesting  that  the  fleet  responds  to  spending  shocks  in  an  inelastic 
manner.  The  results  indicate  that  the  CC130  fleet  is  maintained  robustly  and  that  funding 
has  not  fallen  below  a  critical  level  that  would  induce  correlations  between  spending  and 
performance.  The  techniques  we  apply  in  this  study  can  be  applied  generally  to  any  project 
that  requires  an  understanding  of  correlations  in  data. 


Résumé


We apply the methods of random matrix theory to discover how National Procurement
spending and the performance of the CC130 fleet are interrelated. By understanding the
eigenvalue spectrum of the correlation matrices relating to performance and spending, we
construct the minimal spanning tree of the system in order to identify the networked
hierarchies in the data. We find that no significant link exists between spending and
high-level performance indicators, which suggests that the fleet responds inelastically to
spending shocks. These results show that maintenance of the CC130 fleet is robust and that
funding has not fallen below the critical threshold that would produce correlations between
spending and performance. The same techniques can be applied to any project that requires
an understanding of correlations in data.
Executive summary

A  Random  Matrix  Theory  Approach  to  National 
Procurement  Spending 

David W. Maybury; DRDC CORA TM 2010-168; Defence R&D Canada - CORA;

August  2010. 

Over  the  last  seven  years,  ADM(Mat)  has  sought  a  deeper  understanding  of  the  relationship 
between  National  Procurement  (NP)  spending  and  fleet  performance  in  the  hopes  that  such 
linkages  would  provide  a  first  step  in  the  development  of  a  funding  optimization  procedure. 
Furthermore,  an  understanding  of  spending  effects  on  fleet  operations  might  also  provide 
insight  into  optimal  sparing  levels,  optimal  maintenance  activities  and  schedules,  and  opti¬ 
mal  replacement  times.  In  this  study  we  take  an  approach  different  from  past  studies  that 
attempted  to  uncover  relationships  in  fleet  performance  and  spending.  We  apply  results 
from  random  matrix  theory  and  graph  theory  to  search  for  relationships  in  the  data.  To 
demonstrate  the  methods,  we  focus  our  study  on  the  CC130  fleet,  using  ten  years  of  data, 
as  requested  by  the  DCOS(Mat). 

We  demonstrate  that  excess  noise  in  the  correlation  measures  between  CC130  performance 
indicators  represents  a  serious  obstacle  in  developing  a  model  that  would  connect  NP 
spending to fleet performance. Our results demonstrate that NP spending does not have a
strong  relationship  with  performance.  Since  NP  spending  connects  to  the  larger  economy, 
fluctuations  in  costs  associated  with  the  CC130  fleet  can  quickly  divorce  from  underlying 
financials.  The  excess  noise  in  the  correlation  measures  tells  us  that  randomness  plays  a 
large  role  in  any  apparent  correlation  between  NP  spending  and  performance  indicators. 
Changes  in  NP  spending  at  the  levels  observed  in  the  data  have  little  impact  on  perfor¬ 
mance.  We  can  conclude  that  maintenance  activity  is  highly  robust  -  spending  shocks  of 
the  size  observed  in  the  data  do  not  have  a  statistical  impact  on  major  performance  indi¬ 
cators.  Thus,  given  the  funding  levels  over  the  last  ten  years,  the  CC130  fleet  does  not 
respond  to  spending  fluctuations.  The  methods  we  use  in  this  paper  can  be  applied  to  other 
fleets  and  equipment.  In  addition  to  searching  for  relationships  between  spending  and  per¬ 
formance,  we  can  apply  the  theory  of  random  matrices  to  other  instances  in  which  we  need 
to  examine  correlations  in  data. 
Over the last seven years, ADM(Mat) has sought a better understanding of the links between
National Procurement (NP) spending and fleet performance, in the hope that these links
would mark the starting point for developing a funding optimization procedure. In addition,
understanding the effects of spending on the fleet could help define the optimal quantities
of spare parts to keep in stock, optimal maintenance activities and schedules, and the best
replacement schedules. In this study, we did not adopt the approach used in earlier studies
that aimed to identify links between spending and fleet performance. We apply results
obtained from random matrix theory and graph theory to search for links in the data. To
demonstrate the methods, we chose to examine the CC130 fleet in this study, using ten years
of data, at the request of the DCOS(Mat). We demonstrate that excess noise in the
correlation measures between CC130 fleet performance indicators considerably hinders the
development of a model that would link NP spending to fleet performance. Our results show
the absence of a strong link between NP spending and performance. Because NP spending is
tied to the economy as a whole, cost fluctuations associated with the CC130 fleet can
quickly separate from the underlying financial data. The excess noise in the correlation
measures reveals that randomness accounts for much of any apparent correlation between NP
spending and performance indicators. Variations in NP spending of the size observed in the
data have little effect on performance. We can conclude that maintenance is highly robust;
spending shocks of the size observed in the data have no statistical impact on the main
performance indicators. Consequently, given the funding levels of the last ten years, the
CC130 fleet does not react to spending fluctuations. The methods we use in this study can
also be applied to other fleets and equipment. We can use random matrix theory not only to
discover links between spending and performance, but also in other cases where correlations
in data need to be examined.
1 Introduction


Things  are  seldom  what  they  seem. 

Skim  milk  masquerades  as  cream. 

—  Gilbert  and  Sullivan,  H.M.S.  Pinafore 

1.1  Background 

Over  the  last  seven  years,  ADM(Mat)  has  sought  a  deeper  understanding  of  the  relationship 
between  National  Procurement  (NP)  spending  and  fleet  performance.  The  discovery  of 
linkages  between  spending  and  operational  availability  (A0)  would  provide  a  first  step  in 
the  development  of  a  funding  optimization  procedure.  ADM(Mat)  desires  a  tool  based 
on  NP  spending/performance  relationships  that  would  help  elicit  the  best  possible  fleet 
and  equipment  performance  at  the  best  possible  cost.  Furthermore,  an  understanding  of 
spending  effects  on  fleet  operations  might  also  provide  insight  into  optimal  sparing  levels, 
optimal  maintenance  activities  and  schedules,  and  optimal  replacement  times.  In  a  period 
of  declining  budgets,  ADM(Mat)  requires  an  analysis  of  the  impact  NP  spending  has  on 
DND  fleets. 

While  the  problem  seems  well  posed,  any  potential  analysis  that  attempts  to  isolate  the 
effect  of  spending  levels  on  fleet  performance  faces  extreme  hurdles.  We  immediately  rec¬ 
ognize  that  since  spending  connects  to  a  myriad  of  exogenous  economic  factors,  such  as 
inflation,  price  fluctuations  in  materiel,  and  worldwide  supply  chain  pressures,  a  simple 
one-to-one  map  cannot  exist  between  spending  and  any  performance  measure.  For  exam¬ 
ple,  in  any  one  period,  spending  may  rise  as  the  result  of  an  increase  in  the  cost  of  lubricants 
while  performance  may  decline  due  to  the  discovery  of  an  unexpected  aging  effect.  The 
problem  of  connecting  NP  spending  to  fleet  performance  must  rely  on  a  statistical  analysis 
of  changes  in  both  fleet  indicators  and  costs  as  primary  inputs. 

The last five years have seen periods of concentrated efforts by the Directorate of Materiel
Group  Operational  Research  (DMGOR)  to  uncover  relationships  between  NP  spending 
and  fleet  performance.  Initially,  the  DMGOR  followed  two  promising  methods  to  find 
linkages  in  the  NP  costing  data  and  fleet  performance  indicator  data  with  the  eventual 
goal  of  creating  the  “Providing  a  New  Assessment  for  Costing  Equipment  Availability” 
model (PANACEA) [1]. The first method treats operational availability as the steady state
solution  of  a  differential  equation  that  contains  a  characteristic  relaxation  time.  In  this 
toy  model,  A0  tracks  a  mean  reverting  process  -  A0  relaxes  back  to  its  equilibrium  level 
after  a  shock  disturbs  the  system.  The  model  contains  qualitative  features  that  have  direct 
interpretations  through  performance  indicators  (such  as  mean  time  between  failures,  mean 
down  times,  and  response  parameters)  which  afford  an  asymptotic  solution  in  terms  of 
twelve  free  parameters.  With  the  use  of  suitable  approximations  and  redefinitions,  the 
large  free  parameter  set  can  be  reduced  to  five  inputs.  The  second  attempted  solution 
uses feedforward artificial neural networks to search for a functional relationship between
funding  and  A0  for  the  CP140,  the  CH124,  the  CC130,  and  the  CF188. 

In  effect,  both  previous  attempts  use  a  filter  method  that  recursively  makes  predictions 
while  updating  internal  system  parameters  at  each  time  step  in  the  presence  of  noise.  The 
system  variables  that  both  models  attempt  to  isolate  act  as  elasticity  parameters  for  NP 
spending  with  performance1.  Since  the  data  contains  noise,  the  filter  techniques  rely  on 
stochastic  methods  to  update  estimates  of  the  response  parameters,  thereby  updating  the 
estimate  of  the  elasticity  of  NP  spending  on  performance.  The  most  celebrated  filter  used 
in  stochastic  control  is  the  Kalman  filter  (for  example,  see  [2],  and  [3]),  which  requires 
an  understanding  of  the  noise  and  covariances  within  the  system.  The  previous  two  mod¬ 
els  have  broad  similarities  with  Kalman  filters  and  thus  we  understand  the  origin  of  each 
model’s  response  parameters  along  with  their  estimates,  which  carry  information  about 
correlations  within  the  fleet’s  time  series  data. 

Unfortunately,  both  models  have  failed  to  answer  the  question  as  to  whether  a  relationship 
exists  in  the  data.  The  mean  reverting  differential  equation  contains  too  many  free  parame¬ 
ters  given  the  summary  level  nature  of  the  data  and  the  data’s  intrinsic  noise.  The  artificial 
neural  network  could  not  be  trained  at  a  sufficient  level  to  find  meaningful  relationships. 
In  fairness  to  the  models  and  the  hard  work  that  went  into  the  attempts,  looking  for  a  map 
between  fleet  performance  and  summary  level  data  hinges  on  approximations.  Each  of  the 
approximations within the models, while reasonable in itself, does not capture enough
of  the  complexity  of  a  fleet.  Embedded  within  the  fleet’s  operations  are  multiple  queues 
-  from  sparing  to  scheduled  inspections  -  that  interact  in  highly  non-trivial  ways.  On  the 
other  hand,  an  attempt  to  capture  the  entire  fleet’s  operations  bottom  up  and  connect  the 
entire  problem  to  funding  would  not  only  prove  exceedingly  difficult,  but  it  is  not  clear  that 
such  an  undertaking  would  provide  insight  into  funding  relationships2.  In  the  end,  such  a 
model  might  prove  more  descriptive  than  predictive. 

This  paper  will  take  a  different  tack  relative  to  past  approaches.  Instead  of  attempting  to 
build  complicated  analytical  relationships  between  performance  indicators  and  funding,  we 
search  the  data  for  basic  information  content.  Given  that  the  filter  method  approaches  of 
previous  attempts  implicitly  require  an  understanding  of  covariances,  we  focus  our  efforts 
on  the  correlation  structure  of  the  data.  The  limit  of  the  information  content  within  the 

1 Elasticity is a concept from economics. The price elasticity of demand is defined as $e = \frac{dQ/Q}{dP/P}$, where
Q and P denote demand and price respectively. Elasticity measures the effect of relative changes between
parameters.

2  While  a  detailed  bottom  up  stochastic  queuing  model  built  around  filter  methods  might  identify  potential 
opportunities  to  increase  efficiency  by  isolating  bottlenecks  in  the  fleet’s  combined  sparing  policy,  main¬ 
tenance  and  training  schedules,  and  other  performance  driven  activities,  such  a  study  would  require  huge 
amounts  of  data  and  a  tremendous  amount  of  concentrated  effort  by  a  team  of  analysts.  Given  the  compli¬ 
cated  nature  of  several  interacting  queues,  any  attempt  at  removing  a  bottleneck  to  increase  efficiency  may 
lead  to  new  unanticipated  bottlenecks  in  other  parts  of  the  system.  It  is  not  obvious  that  a  detailed  study  of 
low  level  data  would  provide  sufficient  value  above  existing  tools. 
data will establish the level of connections that we can make within the fleet and address
the  question  as  to  whether  it  makes  sense  to  embark  on  constructing  a  detailed  model. 
We  adopt  an  agnostic  philosophy  about  which  relationships  should  have  strong  links  -  we 
simply  let  the  data  direct  us.  The  methods  that  we  employ  borrow  heavily  from  quanti¬ 
tative  finance.  In  particular,  we  apply  the  same  techniques  that  quantitative  analysts  use 
to  search  for  stock  hierarchies  and  networks  within  the  market.  Our  problem  of  searching 
for links between fleet performance indicators and funding parallels the portfolio problem
of  identifying  market  sectors  for  optimal  capital  allocation.  By  applying  these  methods, 
we  can  identify  the  hubs  and  clusters  in  performance  and  costing  networks  (or  the  lack 
thereof).  The  number,  size,  and  strength  of  the  clusters  in  the  data  will  determine  the  infor¬ 
mation  content  and  therefore  help  resolve  the  general  applicability  of  filter  methods  with 
NP  spending  and  performance  data.  Given  the  new  approach  taken  in  this  paper,  we  focus 
on  the  methodology  by  using  one  DND  fleet,  the  CC130,  as  a  template  for  the  application. 

1.2  Scope 

ADM(Mat)  requires  a  study  to  identify  exploitable  information  from  the  relationships  be¬ 
tween  NP  spending  and  fleet  performance  to  optimize  NP  allocations.  In  particular,  a 
former  COS(Mat)  [4]  and  the  current  DCOS(Mat)  have  tasked  the  DMGOR  to  develop 
a  model  or  approach  that  will  allow  a  more  logical  articulation  of  the  linkage  between 
the  resources  allocated  to  National  Procurement  in  the  areas  of  spares,  repair  and  over¬ 
haul  (R&O),  and  other  integrated  logistics  support  (ILS)  activities.  In  discussions  with  the 
DCOS(Mat),  the  CC130  Hercules  lift  fleet  was  identified  as  a  priority  for  this  study.  The 
DMGOR’s  response  to  this  request  focuses  on  discovering  information  by  isolating  clus¬ 
ters  in  NP  spending  and  performance  data  with  the  CC130  fleet.  Our  modelling  methods 
aim  to: 

•  use  the  theory  of  random  matrices  to  understand  correlations  within  time  series  data; 

•  identify  clusters  in  the  correlation  data  through  the  use  of  graph  theory  techniques; 
and 

•  establish  the  feasibility  of  continuing  future  studies  seeking  to  create  an  NP  spending 
optimization  model. 

We  obtained  all  performance  data  on  the  CC130  from  the  AEPM  PERFORMA  database 
[5]  and  NP  data  from  Financial  and  Managerial  Accounting  System  (FMAS)  [6]. 

We  organize  the  paper  in  three  parts.  Following  the  introduction  we  explain  the  modelling 
techniques  in  section  2.  In  section  3,  we  examine  the  data  and  display  key  results  from 
the  analysis.  Finally,  section  4  contains  the  conclusions  and  discusses  future  avenues  for 
research.  We  reserve  the  annexes  for  a  detailed  treatment  of  the  mathematical  techniques 
and  technical  definitions. 
2 Methodology


To  determine  the  influence  of  NP  spending  on  fleet  performance  we  need  to  examine  how 
changes  in  spending  and  performance  measures  correlate.  Recall  that  the  correlation  coef¬ 
ficient  (for  example,  see  [7])  between  two  random  variables  is  defined  by3 


$$\rho(X,Y) = \frac{E[(X - \bar{x})(Y - \bar{y})]}{\sigma(X)\,\sigma(Y)} \tag{1}$$


where $\bar{x}$ and $\bar{y}$ respectively denote the mean of the random variables $X$ and $Y$, and $\sigma(\cdot)$
denotes the standard deviation. The correlation coefficient has the range $[-1, 1]$ for any
pair of random variables. Perfectly correlated (anti-correlated) random variables have
$\rho = +(-)1$. If two random variables are highly correlated, we will find that changes in one
random variable match the changes in the other.
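As a minimal sketch of eq.(1), the sample version of the correlation coefficient can be computed directly from its definition. The function name `correlation` and the toy data are illustrative assumptions, not part of the original study.

```python
import numpy as np

def correlation(x, y):
    """Sample estimate of eq.(1): E[(X - mean_x)(Y - mean_y)] / (sigma(X) sigma(Y))."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # deviations from the mean of X
    dy = y - y.mean()  # deviations from the mean of Y
    # np.std uses the population convention (ddof=0), matching np.mean above.
    return float(np.mean(dx * dy) / (x.std() * y.std()))

# Hypothetical toy data: a linear map of x is perfectly (anti-)correlated with x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
rho_pos = correlation(x, 2.0 * x + 1.0)  # perfectly correlated pair
rho_neg = correlation(x, -3.0 * x)       # perfectly anti-correlated pair
print(rho_pos, rho_neg)
```

The two limiting cases reproduce the bounds of the range quoted above, up to floating point rounding.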


Searching  for  relationships  in  time  series  data  requires  an  understanding  of  cross  correla¬ 
tions.  In  particular,  the  relationship  between  incremental  changes  in  time  series  data  can 
help  us  reveal  information  content  in  the  data.  Thus,  to  discover  relationships,  it  appears 
we  need  only  to  estimate  the  correlation  coefficient  between  time  series  and  isolate  only 
those  measurements  that  have  a  correlation  coefficient  above  a  predetermined  cutoff  (e.g. 
$|\rho| = 0.7$). Unfortunately, this simple approach can lead to disaster - spurious correlations4
spoil  our  ability  to  resolve  information,  especially  if  the  data  contains  a  high  level  of  noise. 


As  a  concrete  example,  imagine  that  we  have  20  time  series  constructed  as  uncorrelated, 
i.i.d.5 Gaussian noise with zero mean and unit variance. Given enough measurements,
our  estimates  for  each  cross  correlation  coefficient  will  tend  to  zero.  Unfortunately,  this 
trend  toward  zero  often  requires  large  amounts  of  data  to  become  apparent.  Let  us  assume 
that  our  20  independent  time  series  each  have  30  measurements  and  let  us  construct  the 
correlation  matrix  from  simulated  data.  By  construction,  each  time  series  is  uncorrelated 
yet  the  values  of  the  correlation  matrix  displayed  in  figure  1  show  evidence  of  substantial 
cross  correlations.  If  we  were  to  use  the  numerical  estimates  of  the  cross  correlations  given 
in  figure  1  as  model  inputs  for  cross  correlations,  we  would  be  led  horribly  astray.  A 
priori  we  know  that  no  relationships  exist  in  the  data,  but  the  correlation  matrix  with  only 

3The operator $E[\cdot]$ denotes the expectation.

4Roughly  speaking,  a  spurious  correlation  is  a  relationship  that  occurs  by  chance  with  no  underlying 
connection.  Spurious  correlations  frequently  appear  in  the  popular  media  and  perhaps  the  most  fantastic 
example  is  the  Super  Bowl  Indicator  for  the  stock  market.  This  indicator  claims  that  if  the  National  Football 
Conference  wins  the  super  bowl,  then  the  Dow  Jones  Industrial  Average  will  see  a  bull  market  over  the 
coming  year  whereas  a  win  by  the  American  Football  Conference  will  see  a  bear  market.  The  Super  Bowl 
Indicator  has  a  success  rate  of  approximately  80%  over  the  last  40  years.  This  result  stems  purely  from 
coincidence  yet  the  data  appears  to  have  a  high  level  of  correlation.  When  we  have  a  large  amount  of  data, 
spurious  correlations  -  like  the  Super  Bowl  Indicator  -  will  frequently  appear  and  we  must  guard  ourselves 
from incorrect conclusions. (Lest our retirement investment strategy revolve around the performance of the NFL!)

5i.i.d.  stands  for  independent  and  identically  distributed. 
Figure 1: Empirically measured correlations among 20 uncorrelated time series each with
30  measurements.  Note  the  large  number  of  significant  spurious  cross  correlations  (off 
diagonal  elements). 


30  measurements  across  20  time  series  contains  many  spurious  correlations.  On  the  other 
hand,  if  we  use  the  same  20  time  series  but  with  an  order  of  magnitude  more  measurements, 
as  displayed  in  figure  2,  we  see  that  the  spurious  correlations  greatly  diminish. 
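The experiment behind figures 1 and 2 can be sketched in a few lines. This is an illustrative reconstruction under stated assumptions (NumPy's default generator, an arbitrary seed, and the hypothetical helper `max_spurious_correlation`), not the code used to produce the figures.

```python
import numpy as np

def max_spurious_correlation(n_series, n_obs, seed=0):
    """Largest |correlation| between distinct, independent Gaussian series."""
    rng = np.random.default_rng(seed)
    data = rng.standard_normal((n_series, n_obs))  # one row per time series
    corr = np.corrcoef(data)                       # n_series x n_series matrix
    off_diagonal = corr[~np.eye(n_series, dtype=bool)]
    return float(np.abs(off_diagonal).max())

few = max_spurious_correlation(20, 30)    # setting of figure 1
many = max_spurious_correlation(20, 300)  # setting of figure 2
print(f"largest spurious |rho| with 30 measurements:  {few:.2f}")
print(f"largest spurious |rho| with 300 measurements: {many:.2f}")
```

Every off-diagonal entry is spurious by construction, so the largest one gives a rough sense of how misleading a fixed cutoff on the correlation coefficient can be for short series.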

Searching  for  cross  correlations  in  data  proves  a  difficult  challenge.  The  appearance  of 
spurious  correlations  interferes  with  our  ability  to  separate  bona  fide  information  from 
random  fluctuations.  In  general,  correlation  matrices  obtained  from  real  data  come  with  a 
mask,  called  noise  dressing  [8],  that  impedes  our  ability  to  understand  relationships.  We 
require  an  understanding  of  the  noise  dressing  that  sits  on  top  of  the  actual  correlation 
matrix  to  make  progress  in  identifying  information. 

2.1  Correlations  and  random  matrix  theory 

The  problem  of  noise  dressing  with  correlation  matrices  occurs  in  many  fields  -  from  nu¬ 
clear  physics  to  financial  mathematics  -  and  the  theory  of  random  matrices  [9],  [10],  [11] 
Figure 2: Empirically measured correlations among 20 uncorrelated time series each with
300 measurements. Notice that the spurious cross correlations (off diagonal elements)
diminish relative to figure 1.


forms  a  pillar  in  understanding  the  limits  of  data.  To  introduce  the  application  of  random 
matrices  to  the  NP  spending  and  the  CC130  fleet  performance  problem,  consider  a  set  of 
time series which contain known correlated sectors. Let $\xi_i(t)$ denote the $i$-th time series,
where $t \in \{1, 2, 3, \ldots, T\}$ labels each measurement. Normalizing each time series by transforming
the data to standard form,

$$\frac{1}{T}\sum_{t=1}^{T} \xi_i(t) = 0, \qquad \frac{1}{T}\sum_{t=1}^{T} \xi_i(t)^2 = 1, \qquad \forall i, \tag{2}$$

we find that the correlation matrix becomes

$$C_{ij} = \frac{1}{T}\sum_{t=1}^{T} \xi_i(t)\,\xi_j(t). \tag{3}$$

Given that we have explicitly assumed the existence of correlated sectors, we can write
each time series as [8]

$$\xi_i(t) = \frac{\sqrt{g_{s_i}}\,\eta_{s_i}(t) + \epsilon_i(t)}{\sqrt{1 + g_{s_i}}} \tag{4}$$
where $g_{s_i} > 0$ (and sets the strength of the correlations), the $s_i$ are integers denoting each
sector, and $\eta_{s_i}(t)$ and $\epsilon_i(t)$ are uncorrelated i.i.d. Gaussian noise terms. We see that as
$T \to \infty$ the correlation matrix becomes block diagonal, namely

$$C_{ij} = \frac{g_{s_i}\,\delta_{s_i,s_j} + \delta_{i,j}}{1 + g_{s_i}}, \tag{5}$$

where $\delta_{i,j}$ denotes the usual Kronecker delta. The block diagonal structure of $C_{ij}$ reveals a
simple pattern for the eigenvalues. For each block of size $n_s$ we find one eigenvalue

$$\lambda_{s,0} = \frac{1 + g_s n_s}{1 + g_s} \tag{6}$$

and $n_s - 1$ degenerate eigenvalues

$$\lambda_{s,a} = \frac{1}{1 + g_s}. \tag{7}$$


The eigenvalue spectrum of $C_{ij}$ gives us a clue on the way in which correlated sectors emerge
from the data. We can identify large correlated sectors with large eigenvalues (i.e. $n_s \gg 1$).
On the other hand, we see that small eigenvalues can arise from both small and large sectors
that exhibit strong correlations. Thus an excess of large and small eigenvalues can help us
locate large and small correlated sectors. In particular, a large excess of small eigenvalues
suggests the presence of many small sectors with strong correlations6.
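The eigenvalue pattern of eqs.(6) and (7) can be checked numerically by building the idealized block diagonal matrix of eq.(5). The sketch below assumes two hypothetical sectors (sizes 5 and 3, strengths 4.0 and 0.5) chosen only for illustration; the helper names are not from the original study.

```python
import numpy as np

def block_correlation(sizes, strengths):
    """Idealized (infinite T) correlation matrix of eq.(5), one block per sector."""
    n_total = sum(sizes)
    corr = np.zeros((n_total, n_total))  # different sectors are uncorrelated
    start = 0
    for n_s, g_s in zip(sizes, strengths):
        stop = start + n_s
        # Inside a sector: (g_s + delta_ij) / (1 + g_s), so the diagonal equals 1.
        corr[start:stop, start:stop] = (g_s + np.eye(n_s)) / (1.0 + g_s)
        start = stop
    return corr

def predicted_eigenvalues(sizes, strengths):
    """Eigenvalues from eq.(6) (one per sector) and eq.(7) (n_s - 1 per sector)."""
    values = []
    for n_s, g_s in zip(sizes, strengths):
        values.append((1.0 + g_s * n_s) / (1.0 + g_s))
        values.extend([1.0 / (1.0 + g_s)] * (n_s - 1))
    return np.sort(np.array(values))

sizes = (5, 3)          # hypothetical sector sizes n_s
strengths = (4.0, 0.5)  # hypothetical correlation strengths g_s
corr = block_correlation(sizes, strengths)
eigenvalues = np.linalg.eigvalsh(corr)  # returned in ascending order
predicted = predicted_eigenvalues(sizes, strengths)
print(np.round(eigenvalues, 4))
print(np.round(predicted, 4))
```

In this toy case the strongly correlated sector of five series produces both the largest eigenvalue and the cluster of smallest eigenvalues, which illustrates why an excess at either end of the spectrum is informative.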


While the analysis of the block diagonal structure of $C_{ij}$ yields a qualitative identification
procedure, we need a more concrete framework to understand the spectrum of the
eigenvalues. We see that each block in the idealized correlation matrix contains one large
eigenvalue and $n_s - 1$ degenerate eigenvalues, but random noise in the correlation matrix
will  split  degeneracies  and  alter  the  position  of  all  the  eigenvalues.  Understanding  the  noise 
dressing  in  the  correlation  matrix  represents  a  critical  path  item  for  extracting  information. 


The  central  limit  theorem  applied  to  random  matrices  sheds  light  on  our  problem  [10]. 
The  eigenvalue  distribution  of  large  random  matrices  has  a  calculable  expression.  Thus, 
knowing  the  statistical  properties  of  the  elements  of  a  random  matrix,  we  can  compute  the 
corresponding  eigenvalue  spectrum.  Once  we  obtain  the  underlying  spectrum  associated 
with  a  random  matrix,  we  can  compare  the  result  to  the  spectrum  obtained  from  empirical 
data.  Distortions  in  the  empirical  eigenvalue  spectrum  relative  to  a  random  matrix  signal 
the  presence  of  correlated  sectors.  To  calculate  the  eigenvalue  spectrum  of  a  random  matrix, 
suppose  that  we  have  an  M  x  M  real  symmetric  matrix,  C.  The  matrix  C  will  have  real 

6The block diagonal structure of the discussion assumes positive correlations between time series. The
observations can be generalized to include negative correlations with the introduction of spin variables
$\sigma_i = \pm 1$ in eq.(4). These changes do not affect the qualitative arguments in the discussion. For more details
see [8].




eigenvalues $\lambda_a$, $a = 1, 2, 3, \ldots, M$. We can write the density of eigenvalues as

$$\rho(\lambda) = \frac{1}{M}\sum_{a=1}^{M} \delta(\lambda - \lambda_a), \tag{8}$$

where $\delta$ denotes the Dirac delta function. We now define the resolvent of $C$ as

$$G(\lambda) = \left(\lambda I - C\right)^{-1}, \tag{9}$$

where $I$ denotes the $M \times M$ identity matrix. Using well known results from linear algebra,
we can rewrite the trace over $G(\lambda)$ in terms of the eigenvalues of $C$, namely

$$\mathrm{Tr}\,G(\lambda) = \sum_{a=1}^{M} \frac{1}{\lambda - \lambda_a}. \tag{10}$$

In the large $M$ limit, we can use the identity (see Annex A for proof)

$$\frac{1}{x - i\epsilon} = \mathrm{PP}\,\frac{1}{x} + i\pi\,\delta(x) \qquad (\epsilon \to 0), \tag{11}$$

where PP denotes the principal part, to write the density function as

$$\rho(\lambda) = \lim_{\epsilon \to 0} \frac{1}{M\pi}\,\mathrm{Im}\left(\mathrm{Tr}\,G(\lambda - i\epsilon)\right). \tag{12}$$

The integral representation of the determinant of real symmetric matrices (see Annex A)
allows us to compute $G(\lambda)$ in a tractable form. If we assume that $C = HH^T$, where $H$ is
an $M \times N$ matrix composed of i.i.d. Gaussian elements with zero mean and variance $\sigma^2/N$,
we find that the eigenvalue spectrum of $C$ becomes

$$\rho(\lambda) = \frac{\sqrt{4\sigma^2 Q\lambda - \left(\sigma^2(1 - Q) + Q\lambda\right)^2}}{2\pi\lambda\sigma^2}, \tag{13}$$

where $M \to \infty$, $N \to \infty$ such that $N/M = Q > 1$. Notice that the eigenvalue spectrum of
eq.(13) contains $\lambda_{\min}$ and $\lambda_{\max}$ dictated by the real domain of the radical.
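A short numerical sketch of eq.(13) may help make the spectrum concrete. It assumes that Q is the number of measurements per time series divided by the number of time series, consistent with the Q = 3/2 used for the 20-series, 30-measurement example in this section. The edges are the roots of the radicand, which work out to $\sigma^2(1 \pm 1/\sqrt{Q})^2$. The function names are illustrative.

```python
import numpy as np

def spectrum_edges(q, sigma2=1.0):
    """Roots of the radicand in eq.(13): sigma^2 * (1 -/+ 1/sqrt(Q))^2."""
    root = 1.0 / np.sqrt(q)
    return sigma2 * (1.0 - root) ** 2, sigma2 * (1.0 + root) ** 2

def rmt_density(lam, q, sigma2=1.0):
    """Eigenvalue density of eq.(13); zero wherever the radicand is negative."""
    lam = np.asarray(lam, dtype=float)
    radicand = 4.0 * sigma2 * q * lam - (sigma2 * (1.0 - q) + q * lam) ** 2
    rho = np.zeros_like(lam)
    inside = radicand > 0.0
    rho[inside] = np.sqrt(radicand[inside]) / (2.0 * np.pi * lam[inside] * sigma2)
    return rho

q = 1.5  # 30 measurements / 20 series
lam_min, lam_max = spectrum_edges(q)
grid = np.linspace(lam_min, lam_max, 20001)
rho = rmt_density(grid, q)
# Trapezoid rule written out by hand to avoid depending on a NumPy version.
total = float(np.sum(0.5 * (rho[1:] + rho[:-1]) * np.diff(grid)))
print(f"support: [{lam_min:.4f}, {lam_max:.4f}], integrated density: {total:.4f}")
```

The density should integrate to approximately one over its support, which is a convenient sanity check on the reconstruction of the formula.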


To  illustrate  the  application  of  random  matrices  to  time  series  data,  let  us  return  to  our 
example  in  which  we  examined  the  correlation  matrix  for  20  time  series  with  30  measure¬ 
ments.  Recall  that  the  resulting  correlation  matrix  revealed  many  large  spurious  correla¬ 
tions  (see  figure  1).  Given  the  time  series  data  set,  we  can  compare  the  empirical  eigenvalue 
spectrum  of  the  correlation  matrix  to  the  spectrum  of  an  infinite  random  correlation  matrix 
with $Q = 3/2$. In figure 3, we display the empirical eigenvalue spectrum from the random
time  series  as  a  histogram  along  with  the  theoretical  eigenvalue  spectrum  of  the  random 
correlation  matrix.  Notice  that  the  theoretical  curve  fully  explains  the  histogram  which 
suggests  that  all  correlations  exhibited  by  the  empirical  correlation  matrix  are  spurious. 
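The comparison can be sketched numerically as follows. This is an illustrative simulation under assumptions (an arbitrary seed, unit variance, and spectrum edges taken as $(1 \pm 1/\sqrt{Q})^2$ from the radicand of eq.(13)), not the calculation used for figure 3.

```python
import numpy as np

rng = np.random.default_rng(1)
n_series, n_obs = 20, 30
q = n_obs / n_series  # Q = 3/2, as in the example above

# Uncorrelated Gaussian time series and their empirical correlation matrix.
data = rng.standard_normal((n_series, n_obs))
corr = np.corrcoef(data)
eigenvalues = np.linalg.eigvalsh(corr)

# Edges of the random matrix spectrum for unit variance.
lam_min = (1.0 - 1.0 / np.sqrt(q)) ** 2
lam_max = (1.0 + 1.0 / np.sqrt(q)) ** 2

inside = (eigenvalues >= lam_min) & (eigenvalues <= lam_max)
fraction_inside = float(np.mean(inside))
print(f"random matrix support: [{lam_min:.3f}, {lam_max:.3f}]")
print(f"fraction of empirical eigenvalues inside: {fraction_inside:.2f}")
```

Because the matrix is small, an eigenvalue or two may fall slightly outside the theoretical support; these are finite size effects rather than evidence of a correlated sector.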
Figure 3: Empirical eigenvalue spectrum of the time series correlation matrix given by
figure  1  with  the  theoretical  curve  predicted  by  random  matrix  theory.  Up  to  finite  size 
effects,  the  data  is  explained  by  the  theoretical  curve. 

2.2  Minimal  spanning  tree  and  clustering 

The  theory  of  random  matrices  helps  us  determine  the  presence  of  correlated  sectors  in 
the data. If we find that the spectrum of a correlation matrix does not agree with the
spectrum  obtained  from  a  random  matrix,  then  we  have  evidence  for  correlated  clusters. 
While  random  matrix  theory  helps  us  recognize  the  existence  of  a  correlation  structure, 
the  eigenvalue  spectrum  does  not  directly  tell  us  the  number  of  clusters  in  the  data  nor 
which  set  of  time  series  form  a  cluster.  To  identify  clusters,  we  will  use  the  stock  market 
and  condensed  matter  physics  as  inspiration.  Methods  in  graph  theory  can  help  identify 
networks  and  hierarchies  in  clustered  data  [12].  In  particular,  the  application  of  graph 
theoretic techniques [13] to stock markets has not only helped identify market sectors, but
has also imparted a deeper understanding of the entire economic organization of financial
markets. 

To  begin  our  application  of  graph  theory  methods  to  our  problem,  we  need  the  concept  of 
distance  among  correlated  time  series,  i.e.  we  need  a  metric  space.  It  can  be  shown  [13] 
that

$$ d_{ij} = \sqrt{2\,(1 - \rho_{ij})} \tag{14} $$

satisfies the requirements of a Euclidean distance, where $\rho_{ij}$ denotes the correlation coefficient between the i-th and j-th time series (see eq.(1)). While we can use eq.(14) to compute
the  distance  between  any  pair  of  time  series,  we  cannot  directly  use  the  distance  to  isolate 
sectors.  Noise  dressing  interferes  with  a  clean  interpretation  of  the  distance  between  pairs 
of  time  series. 
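
As a small illustration of eq.(14), the sketch below converts a correlation matrix into a distance matrix and checks the triangle inequality. The 3 x 3 correlation matrix and the function names are hypothetical and serve only as an example.

```python
import math


def correlation_to_distance(corr):
    """Apply d_ij = sqrt(2 (1 - rho_ij)) to every entry of a correlation matrix."""
    # max(0, ...) guards against tiny negative values caused by rounding.
    return [[math.sqrt(max(0.0, 2.0 * (1.0 - rho))) for rho in row] for row in corr]


def satisfies_triangle_inequality(dist, tol=1e-12):
    """Check d_ij <= d_ik + d_kj for every triple of indices."""
    n = len(dist)
    return all(dist[i][j] <= dist[i][k] + dist[k][j] + tol
               for i in range(n) for j in range(n) for k in range(n))


# Hypothetical correlation matrix for three time series.
EXAMPLE_CORR = [
    [1.0, 0.8, 0.3],
    [0.8, 1.0, 0.5],
    [0.3, 0.5, 1.0],
]
EXAMPLE_DIST = correlation_to_distance(EXAMPLE_CORR)
```

Perfectly correlated series sit at distance 0, uncorrelated series at distance sqrt(2), and perfectly anti-correlated series at distance 2.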

We  can  overcome  the  shortcomings  of  the  Euclidean  distance  by  placing  the  time  series  on 
an  ultrametric  space.  In  an  ultrametric  space,  the  triangle  inequality  of  a  metric  space  is 
replaced  by  the  stronger  condition  (called  the  ultrametric  inequality) 

$$ d_{ij} \le \max[d_{ik}, d_{kj}]. \tag{15} $$

It turns out that given a metric distance with n objects, many ultrametric spaces can be constructed through re-partitioning (see [13] and references therein). Of all the ultrametric spaces that can be associated with a distance $d_{ij}$, the subdominant ultrametric space singles itself out as it can be obtained from the minimal spanning tree (MST) that connects the n objects. The MST of a weighted graph is a tree with n - 1 edges that minimizes the sum of the edge distances. In the end, the subdominant ultrametric space yields a unique indexed hierarchy for our problem. Fortunately a simple algorithm exists - the Kruskal algorithm (see [13] and references therein) - which allows us to directly construct the MST with a Euclidean distance.
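
The subdominant ultrametric distance between two objects equals the largest edge on the MST path that joins them. As a hedged sketch, the code below computes the same quantity without building the tree, by taking the smallest possible "largest step" over all paths (a Floyd-Warshall style closure). This is a swapped-in technique chosen for brevity rather than the Kruskal construction named in the text, and the four-point distance matrix is hypothetical.

```python
def subdominant_ultrametric(dist):
    """Return u_ij = min over paths from i to j of the largest step on the path."""
    n = len(dist)
    u = [row[:] for row in dist]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = max(u[i][k], u[k][j])
                if via_k < u[i][j]:
                    u[i][j] = via_k
    return u


def is_ultrametric(u, tol=1e-12):
    """Check the ultrametric inequality u_ij <= max(u_ik, u_kj) for all triples."""
    n = len(u)
    return all(u[i][j] <= max(u[i][k], u[k][j]) + tol
               for i in range(n) for j in range(n) for k in range(n))


# Hypothetical metric distances between four objects W, X, Y, Z.
EXAMPLE_DIST = [
    [0.0, 1.0, 2.5, 5.0],
    [1.0, 0.0, 2.0, 4.5],
    [2.5, 2.0, 0.0, 3.0],
    [5.0, 4.5, 3.0, 0.0],
]
EXAMPLE_ULTRA = subdominant_ultrametric(EXAMPLE_DIST)
```

In this example the MST uses the edges W-X (1.0), X-Y (2.0) and Y-Z (3.0), so Z lies at ultrametric distance 3.0 from every other object even though its Euclidean distances range from 3.0 to 5.0.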

As  an  illustration  of  the  Kruskal  algorithm,  consider  the  Euclidean  distance  matrix  obtained 
from  a  hypothetical  correlation  matrix: 

     A       B       C       D       E       F
A    0       0.4700  0.8900  1.2000  0.9800  1.1100
B    0.4700  0       1.0100  0.2000  0.8900  1.3000
C    0.8900  1.0100  0       0.7500  0.5200  1.1800      (16)
D    1.2000  0.2000  0.7500  0       0.9900  0.7100
E    0.9800  0.8900  0.5200  0.9900  0       0.4400
F    1.1100  1.3000  1.1800  0.7100  0.4400  0

Applying the Kruskal algorithm, we need to parse through the distance matrix proceeding pair-wise from the closest pair to the farthest. In our example B and D form the closest pair with distance 0.20 and the next closest pair is E and F with distance 0.44. At this point we have two disjoint sections of the MST. We can see these two pairs represented by a dendrogram in figure 4. Notice that the height of each pair in the dendrogram corresponds to each pair's Euclidean distance. We find that the next most closely connected pair is A and B with distance 0.47. In the ultrametric space, the A-B pair links A and D with the same distance as A to B since B is already connected to D in the tree. We see the linkage in figure 4: A connects to the existing B-D pair. Thus, ultrametrically, A is the same distance from B and D, with the largest Euclidean distance setting the ultrametric distance for the two pairs. Continuing through the distance matrix, we find the next closest pair is C and E (distance 0.52), which connects C to F in the tree. Again, we see this linkage connects to an existing pair, namely, the E-F pair. Finally, the next connection, D and F with distance 0.71, completes the tree. If we continue parsing through the distance matrix, we encounter pairs that have already been fixed in the ultrametric space. For example, after the D-F connection we have C and D with Euclidean distance of 0.75, but C and D are already connected in the tree and so we ignore this connection. We display the final ultrametric distance graph using the full dendrogram in figure 4. The colours of figure 4 show clusters that have an ultrametric distance less than 70% of the maximal ultrametric distance in the tree. Thus, in our example, if we use 70% of the maximal ultrametric distance as a cutoff, we would conclude that B-D-A and E-F form independent data clusters. We will use this method to identify the presence of clusters in the NP spending and fleet performance data.

Figure 4: Dendrogram on the ultrametric space for the data considered in the matrix eq.(16). Note that, at 70% of the maximum ultrametric distance, the dendrogram singles out two clusters from the data.
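
The walk-through above can be reproduced with a short sketch of the Kruskal algorithm applied to the matrix of eq.(16). The union-find bookkeeping and the function names are illustrative choices, not a description of the software used in the study.

```python
LABELS = ["A", "B", "C", "D", "E", "F"]
DIST = [
    [0.00, 0.47, 0.89, 1.20, 0.98, 1.11],
    [0.47, 0.00, 1.01, 0.20, 0.89, 1.30],
    [0.89, 1.01, 0.00, 0.75, 0.52, 1.18],
    [1.20, 0.20, 0.75, 0.00, 0.99, 0.71],
    [0.98, 0.89, 0.52, 0.99, 0.00, 0.44],
    [1.11, 1.30, 1.18, 0.71, 0.44, 0.00],
]


def _find(parent, i):
    """Return the representative of the set containing i (with path halving)."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def kruskal_mst(dist):
    """Return the MST edges (i, j, distance) in the order Kruskal accepts them."""
    n = len(dist)
    parent = list(range(n))
    edges = sorted((dist[i][j], i, j) for i in range(n) for j in range(i + 1, n))
    tree = []
    for weight, i, j in edges:
        root_i, root_j = _find(parent, i), _find(parent, j)
        if root_i != root_j:  # skip pairs that are already connected in the tree
            parent[root_i] = root_j
            tree.append((i, j, weight))
            if len(tree) == n - 1:
                break
    return tree


def clusters_below(tree, n, cutoff):
    """Group objects joined by MST edges shorter than the cutoff."""
    parent = list(range(n))
    for i, j, weight in tree:
        if weight < cutoff:
            parent[_find(parent, i)] = _find(parent, j)
    groups = {}
    for i in range(n):
        groups.setdefault(_find(parent, i), []).append(i)
    return sorted(groups.values())
```

The largest MST edge sets the maximal ultrametric distance, so a 70% cutoff can be formed as `0.7 * max(w for _, _, w in kruskal_mst(DIST))` and passed to `clusters_below`. With the matrix above, C is left as a singleton at that cutoff.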


DRDC  CORA  TM  2010-168 



3  Results 


3.1  Data  selection 

This study uses two data sources for the CC130 fleet: the PERFORMA database for fleet performance indicators and FMAS for NP spending levels. In total, we select 13 high level performance indicators that are expected to be significantly correlated with NP spending. Furthermore, we break down the NP spending into spares and R&O to help identify correlations within spending subsets. For this study, we use cost centres:

• 8485QA: CC130 Spares;

• 8485QB: T56 Engine Spares;

• 8485QH: CC130 Airframe Repair and Overhaul;

• 8485QJ: CC130 Miscellaneous Engine;

• 8485QF: CC130 T56 Engine Repair and Overhaul;

• 8485TM: Repair and Overhaul Flight Navigation Communication Equipment; and

• 8485UQ: CC130 Ties.

In the total NP part of the study, we use the data from all cost centres, while in the R&O/spares breakdown part of the study we use 8485QA, 8485QB, 8485QH, and 8485QJ, 8485QF respectively.

Constraints imposed by the data limit the number of performance time series that we can use. In constructing a correlation matrix, we require the number of measurements to exceed the number of time series by approximately an order of magnitude to see a signal above the noise dressing. We use the PERFORMA database to extract 10 years of monthly data (December 1998 to November 2008) for the performance indicators, thereby giving us 120 measurements. Thus, we must choose approximately 12 time series from the database. In the data selection process, we need to ensure that the time series data capture the fleet's performance at a high level with an expectation that NP spending has an effect on the indicators themselves. The performance indicators we use are:

1.  All  failures 

2. Ao - Overall operational availability

3.  Corrective  maintenance  person-hours  rate 

4. First level Ao

5. Flying hours

6. Mean flying time between on aircraft corrective forms

7.  Mean  flying  time  between  on  aircraft  preventive  forms 

8.  Mean  flying  time  between  downing  event 

9.  Off  aircraft  maintenance  person-hour  rate 

10.  On  aircraft  maintenance  person-hour  rate 

11. On aircraft robs maintenance person-hour rate

12.  Operation  mission  abort  rate 

13.  Preventive  maintenance  person-hour  rate 

A full description of each performance indicator can be found in Annex B. The data we select from the PERFORMA database concords with the type of data examined in past attempts that address the NP allocation problem. Most of the previous work focused on Ao as the main object to connect with NP spending. In this study, we have broadened the scope but maintained the original flavour of previous work by examining only high level data. The larger scope of the data will help us discover possible indirect relationships between NP spending and Ao.

We obtained the financial data from FMAS broken down by spares and R&O. The data covers the same time frame (in monthly form) as the performance indicator data. The financial data are placed inside a 13-month year to account for spending invoiced at the end of one fiscal year but expensed in the following fiscal year. We correct for the 13-month year by placing the data from the 13th month into the first month of the new fiscal year. We understand that from an accounting perspective the 13th month represents a separate entity to capture actual previous fiscal year spending relationships, but for our study, we need to treat spending as a continuous process. Moving the 13th month spending into the first fiscal month of the following year has the effect of removing the artificial discontinuous jump that we see in the spending data at fiscal year changes. Since we seek a relationship between incremental changes in the data, we must ensure that we make appropriate comparisons in continuous time. We treat spending on spares, spending on R&O, and total NP spending separately in the analysis.
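
The 13-month correction can be sketched as follows. This is an illustration under stated assumptions: each fiscal year is taken to be a list of 13 spending values, "placing" the 13th month into the next year is interpreted as adding it to that year's first month, and the 13th month of the final year is returned separately because its destination lies outside the data. The function name is hypothetical.

```python
def fold_thirteenth_month(fiscal_years):
    """Turn 13-period fiscal years into one continuous monthly series.

    Returns (monthly_values, carry), where carry is the 13th-month amount of
    the last fiscal year, which belongs to a year not present in the input.
    """
    monthly = []
    carry = 0.0
    for year in fiscal_years:
        if len(year) != 13:
            raise ValueError("each fiscal year must contain 13 periods")
        months = list(year[:12])
        months[0] += carry   # previous year's 13th month lands in month 1
        carry = year[12]     # hold this year's 13th month for the next year
        monthly.extend(months)
    return monthly, carry
```

For two fiscal years the output has 24 monthly values, and the first month of the second year contains its own spending plus the 13th-month amount of the first year.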

3.2  Analysis 

We  break  the  analysis  down  into  two  parts:  performance  indicators  with  total  NP  spending, 
and  performance  indicators  with  spares  spending  and  R&O  spending.  Before  we  apply  the 
models  of  the  previous  section,  we  need  to  put  the  data  in  a  standard  form.  Since  we  are 
interested in correlations among changes in the time series, we first recast each time series, $S_j(t)$, as

$$ \xi_j(t_i) = \frac{S_j(t_{i+1}) - S_j(t_i)}{S_j(t_i)}. \tag{17} $$

The $\xi_j(t)$ time series represent the percent changes of the original time series. Renormalizing each time series by placing the data in standard form, we have

$$ \hat{\xi}_j(t) = \frac{\xi_j(t) - \langle \xi_j \rangle}{\sigma_j}, \tag{18} $$

where $\langle \xi_j \rangle$ and $\sigma_j$ denote the mean and standard deviation of $\xi_j(t)$, which implies that each $\hat{\xi}_j(t)$ has zero mean and unit variance. Notice that the differenced time series $\xi_j(t)$ contains one fewer measurement than the original time series. Constructing the time series matrix

$$ M_{ji} = \hat{\xi}_j(t_i), \tag{19} $$

we find that the normalized correlation matrix becomes, up to the overall factor of one over the number of measurements,

$$ C = M M^{\mathsf{T}}. \tag{20} $$

We  use  the  mathematical  structure  of  eq.(17)  through  eq.(20)  for  the  time  series  analysis  in 
the  remainder  of  the  paper.  By  examining  the  time  series  data  using  percent  changes,  we 
immediately  see  that  the  correlation  estimates  will  yield  insight  into  the  elasticity  of  NP 
spending  on  performance. 
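
The chain from eq.(17) to eq.(20) can be illustrated with a short sketch. The function names are hypothetical, the standard deviation is taken over the population (division by the number of measurements), and the factor of one over the number of measurements is applied explicitly so that the diagonal of the result equals one. The sketch assumes no zero entries in the raw series and no constant series.

```python
import math


def percent_changes(series):
    """Eq.(17): relative change between consecutive measurements."""
    return [(b - a) / a for a, b in zip(series, series[1:])]


def standardize(values):
    """Eq.(18): subtract the mean and divide by the standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]


def correlation_matrix(raw_series):
    """Eqs.(19)-(20): correlation matrix of the standardized percent changes."""
    rows = [standardize(percent_changes(s)) for s in raw_series]
    n = len(rows[0])
    return [[sum(x * y for x, y in zip(r1, r2)) / n for r2 in rows] for r1 in rows]
```

Because percent changes are insensitive to the overall scale of a series, two series that differ only by a constant factor are perfectly correlated under this construction.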


3.3  Total  NP  analysis  with  synchronous  time  series 

For the first part of the analysis, we focus on the performance indicators synchronously matched with total NP spending. In total, we have 14 time series each with 119 measurements. To find information content buried in the correlation matrix, we compute the eigenvalue spectrum and compare the result to the Q = 119/14 infinite dimensional random matrix.

Figure 5 shows the result of the decomposition. Notice that the empirical eigenvalue spectrum does not match the expectation from random matrix theory. The Q = 119/14 random matrix yields the maximum and minimum eigenvalues of

$$ \lambda_{\max} = 1.80, \qquad \lambda_{\min} = 0.43, \tag{21} $$

and we see that we have an excess of small and large eigenvalues. While finite size effects and departures from normality in the incremental changes distort the eigenvalue spectrum, the size of the distortions that we see points to the presence of correlated sectors inside the system. In particular, the excess of small eigenvalues suggests the existence of clusters that contain a small number of highly correlated time series in addition to at least one large sector.
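
The two values in eq.(21) follow from the band edges of the random matrix spectrum given in Annex A, evaluated with unit variance. A minimal check, with a hypothetical function name:

```python
import math


def random_matrix_edges(q, sigma2=1.0):
    """Band edges sigma^2 (1 + 1/Q -/+ 2 sqrt(1/Q)) of the random spectrum."""
    spread = 2.0 * math.sqrt(1.0 / q)
    lam_min = sigma2 * (1.0 + 1.0 / q - spread)
    lam_max = sigma2 * (1.0 + 1.0 / q + spread)
    return lam_min, lam_max


# 119 measurements for each of 14 time series.
LAM_MIN, LAM_MAX = random_matrix_edges(119 / 14)
```

Rounded to two decimals, `LAM_MIN` and `LAM_MAX` reproduce 0.43 and 1.80. Empirical eigenvalues well outside this interval are the ones that signal correlated sectors.
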

Figure 5: The empirical eigenvalue spectrum (histogram) for total NP spending synchronously matched to performance indicators (horizontal axis: eigenvalue). The theoretical curve for the Q = 119/14 infinite random matrix is superimposed.


Using the Kruskal algorithm, we construct the MST for the correlations. Given that we are searching for relationships with either positive or negative correlations, we take the absolute value of the correlations in constructing the MST. Thus, if NP spending negatively correlates with a performance measure, we will interpret that performance measure as being close to NP spending in the ultrametric space. Using the absolute value of the correlations allows us to identify time series that know about each other regardless of the type of relationship.
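
One way to read this step is that the absolute value of the correlation replaces the correlation itself in eq.(14). The sketch below makes that assumption explicit; the function name is hypothetical.

```python
import math


def abs_correlation_distance(rho):
    """Distance sqrt(2 (1 - |rho|)), so strong negative correlation counts as close."""
    return math.sqrt(2.0 * (1.0 - abs(rho)))


def signed_correlation_distance(rho):
    """Distance of eq.(14) with the signed correlation, shown for comparison."""
    return math.sqrt(2.0 * (1.0 - rho))
```

With the signed form, a correlation of -0.9 places two series almost as far apart as possible, while the absolute-value form treats -0.9 and +0.9 identically.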

In figure 6, we see that the dendrogram indicates a high level of correlation between First level Ao (4) and Ao (2). We also see that eight time series associated with maintenance activities form a larger block inside the data and that On aircraft maintenance person-hour rate (10) and Preventive maintenance person-hour rate (13) form a tight subgroup. From a qualitative perspective, the clustering we see in the performance measures makes sense. We expect to see a high level of correlation between different Ao measures as well as between certain types of maintenance activities. In building the dendrogram, we used 85% of the maximum ultrametric distance as a cutoff for isolating sectors, and the results concord with the distortions we observe in the eigenvalue spectrum relative to expectations from random matrix theory.7

7 The results of this study are not sensitive to changes in the cutoff. NP only begins to cluster if we use 95%+ of the maximum ultrametric distance. Clusters with a cutoff at the 95% level can be explained by noise alone.

Figure 6: Dendrogram for NP spending (time series 14) synchronously matched to performance indicators. Notice that NP spending branches early in the dendrogram, indicating that NP spending lies far from the performance indicators in the ultrametric space. The number labelling for each time series is the same as that given in the text.

The extraction of the precise number of sectors in the data does not represent the central problem of the exercise. The observation that the NP spending time series (series 14) branches out early from the dendrogram in figure 6 forms the central lesson - NP spending does not form a hub in the MST, nor does NP spending correlate strongly with any of the performance indicators. Given that ultrametrically NP spending lies far from the performance data clusters, it will be exceedingly difficult to construct a model that evolves NP spending synchronously with performance data.

3.4 Total NP analysis with time lags

By relaxing the synchronicity condition, we can search for NP spending clustering in the presence of time lags. Given that spending on equipment often produces benefits at a later date (after all, maintenance and improvements take time), we must search for clustering without the use of synchronous time series. Specifically, since NP spending represents the key time series in the analysis, we consider lags of up to one year in the NP spending relative to the performance measures. Lags in NP spending will identify temporal relationships between NP allocation and the performance of the fleet.

Figure  7  shows  the  eigenvalue  spectrum  for  time  lags  of  1  month,  2  months,  3  months,  6 
months  and  1  year.  In  each  case  we  see  an  excess  of  large  and  small  eigenvalues  which 
suggests  the  presence  of  data  clustering.  We  know  from  the  synchronous  analysis  that  some 
of  these  clusters  represent  the  original  synchronous  performance  measure  clusters.  The 
dendrogram  for  each  time  lag  scenario,  shown  in  figure  8,  demonstrates  that  NP  does  not 
form  a  cluster  with  any  of  the  performance  measures.  We  see  that  NP  spending  branches 
out  early  in  each  dendrogram  which  tells  us  that  even  in  the  presence  of  time  lags,  NP 
spending  does  not  significantly  correlate  with  any  performance  measure.  Again,  the 
analysis  suggests  that  model  building  with  NP  spending  and  performance  data  will  prove 
extraordinarily  challenging. 
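
A lagged comparison can be sketched by pairing spending at month t with performance at month t + lag before computing the correlation. The pairing convention and the function names are assumptions made for illustration, and the two input series are taken to have equal length.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def lagged_correlation(spending, performance, lag):
    """Correlate spending at time t with performance at time t + lag (lag >= 0)."""
    if lag < 0 or lag >= len(spending):
        raise ValueError("lag must satisfy 0 <= lag < number of measurements")
    if lag == 0:
        return pearson(spending, performance)
    return pearson(spending[:-lag], performance[lag:])
```

Each additional month of lag removes one pair of measurements, so a 12 month lag on 119 measurements leaves 107 pairs and a slightly smaller value of Q for the random matrix comparison.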

3.5 Spares and R&O analysis with synchronous time series

We repeat the analysis for spares and R&O treated as individual time series. In this case, we have 15 time series in which the last two time series (14 and 15 respectively) represent spares and R&O spending while the remaining 13 time series represent the original performance indicators. In figure 9 we see the eigenvalue spectrum with no lag in spending along with the corresponding dendrogram. Note that, again, we see an excess of large and small eigenvalues relative to the predictions from random matrix theory. In the corresponding dendrogram we see that the performance indicators account for the clustering and, as in the total NP spending analysis, spares and R&O spending do not form a central hub or network within the time series set.

3.6  Spares  and  R&O  analysis  with  time  lags 

The panels in figure 10 show the empirical eigenvalue spectrum for lagged spares and R&O spending. Again, we see an excess of small and large eigenvalues which suggests the presence of correlated clusters. The dendrograms of figure 11 show that the clusters do not involve the spending time series. The original correlated clusters remain, while spending on spares and R&O continues to branch out early in each dendrogram. These results suggest that using the elasticity for NP spending on spares and R&O as parameter inputs for performance modelling will prove most challenging.

Figure 7: Eigenvalue spectrum of performance indicators with lagged NP spending with theoretical curve: (a) lag 1 month, (b) lag 3 months, (c) lag 6 months, and (d) lag 12 months.

Figure 8: Dendrogram of performance indicators with lagged NP spending: (a) lag 1 month, (b) lag 3 months, (c) lag 6 months, and (d) lag 12 months. Note that NP spending (time series 14) branches early in each dendrogram, indicating that NP spending does not form a hub within the MST.

Figure 9: (a) Eigenvalue spectrum with spares and R&O spending synchronously matched, with theoretical curve. (b) Dendrogram for the performance indicators with synchronously matched spares and R&O spending (time series 14 and 15).

4  Conclusions 


Searching for a connection between NP spending and fleet performance indicators represents a difficult problem. Past attempts, which used an assortment of approaches based generally on filter methods, have been met with frustration. In each attempt, an underlying functional relationship was assumed. The essential idea of each method centered on statistically extracting an elasticity parameter between spending and performance. As a minimum, any model connecting fleet performance to NP spending requires an understanding of the level of correlation between relevant time series. Using random matrix theory and the concept of an ultrametric space, we find that the frustrations of past attempts stem from the high level of noise dressing in the correlation matrix. While it is eminently reasonable to expect an elasticity relationship between NP spending and performance, the correlation matrix refuses to shed light on this problem and therefore stymies model construction for a better articulation between spending and performance.

The application of our techniques to the CC130 has demonstrated that noise dressing represents a serious obstacle in developing a model that would connect NP spending to fleet performance. In some sense, we should not be surprised by our results. NP spending connects to the larger economy with many exogenous factors. Given that fluctuations in the economy influence the costs associated with the CC130 fleet, performance can quickly divorce from underlying financials. An attempt to build a multiple regression model or a filter model that uses economic variables in addition to fleet performance would almost certainly become an unwieldy ad hoc construction. The noise dressing in the correlation matrix tells us that randomness plays a large role in the correlation between NP spending and fleet performance.

Figure 10: Eigenvalue spectrum of performance indicators with lagged spares and R&O spending with theoretical curve: (a) lag 1 month, (b) lag 3 months, (c) lag 6 months, and (d) lag 12 months.

Figure 11: Dendrogram of performance indicators with lagged spares and R&O spending: (a) lag 1 month, (b) lag 3 months, (c) lag 6 months, and (d) lag 12 months. Note that spares and R&O spending (time series 14 and 15) branch early in each dendrogram, indicating that spares and R&O spending do not form a hub within the MST.

We should understand that our results paint a positive picture of the CC130 fleet. Changes in NP spending at the levels observed in the data have little impact on performance. Broadly, we can conclude that the maintenance activity is highly robust - spending shocks of the size observed in the data do not have a statistical impact on major performance indicators.8 While we do not see a correlation between performance and NP spending, clearly if the NP spending level is lowered sufficiently, we will eventually begin to see correlations above the noise dressing as the fleet begins to starve. At some critical spending level, the fleet would not be serviceable and spending would correlate strongly with the ability to bring individual aircraft online. In this sense, we expect that the elasticity of the fleet with respect to spending will go through a phase transition at a sufficiently low level of funding support. The results of this paper show that over the ten year history from December 1998 to November 2008, the critical spending level has not been breached.

While we cannot connect spending to performance indicators, we can construct a useful model of Ao. In [14] and [15], it was shown that mean reverting stochastic processes capture fleet-wide Ao, and an inspection of the CC130 data suggests that such models apply in this case. In previous attempts, the analysts suggested that the best predictor of Ao is the current Ao value and that the ability to predict Ao separated itself from spending issues [1]. The stochastic mean reverting models add to the picture by demonstrating that Ao in many military fleets has a slow mean reverting factor, which gives us a deeper understanding of the fluctuations involved. If ADM(Mat) desires a model that simply forecasts Ao, without costing inputs, the DMGOR can construct robust models.

The methods we use in this paper can be applied to other fleets and equipment. In addition to searching for relationships between spending and performance, we can apply the theory of random matrices to other instances where we need to examine correlations in data. By examining the eigenvalue spectrum and comparing the results with expectations from random matrix theory, we can isolate the effects of correlated sectors - both large and small - in the data. We also avoid ascribing spurious correlations to relationships that do not exist. In the end, the application of random matrix theory with the concept of an ultrametric space for constructing the MST will help us avoid mistaking skim milk for cream.


8 In economic parlance, the response between spending and performance is said to be inelastic.

Annex A: Random matrix theory primer


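In section 2, we learned that we could write the eigenvalue spectrum of a matrix as

$$ \rho(\lambda) = \lim_{\epsilon \to 0} \frac{1}{M\pi}\, \mathrm{Im}\left( \mathrm{Tr}\, G(\lambda - i\epsilon) \right). \tag{A.1} $$

In deriving eq.(A.1), we required the identity,

$$ \frac{1}{x - i\epsilon} = PP\,\frac{1}{x} + i\pi\delta(x) \qquad (\epsilon \to 0). \tag{A.2} $$

To see that the identity in eq.(A.2) holds, observe that

$$ \lim_{\epsilon \to 0} \int_{-\infty}^{\infty} \frac{dx}{x - i\epsilon} = \lim_{\epsilon \to 0} \int_{-\infty}^{\infty} \frac{x + i\epsilon}{x^2 + \epsilon^2}\, dx = \lim_{\epsilon \to 0} \int_{-\infty}^{\infty} \frac{i\epsilon}{x^2 + \epsilon^2}\, dx + \lim_{\epsilon \to 0} \int_{-\infty}^{\infty} \frac{x}{x^2 + \epsilon^2}\, dx. \tag{A.3} $$

We recognize that the integrand of the first integral on the far right-hand side of eq.(A.3) has the form

$$ \lim_{\epsilon \to 0} \frac{\epsilon}{\pi (x^2 + \epsilon^2)} = \delta(x) \tag{A.4} $$

where we have used a distributional identity for the delta function, and we see that the second integral of eq.(A.3) recovers the principal part of the integral of 1/x. Thus, we have the identity eq.(A.2).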

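We are left with the task of computing $\mathrm{Tr}\, G(\lambda)$. We follow [10] in showing the calculational method. To begin, recognize that we can write the trace over $G(\lambda)$ as,

$$ \mathrm{Tr}\, G(\lambda) = \sum_{\alpha=1}^{M} \frac{1}{\lambda - \lambda_\alpha} = \frac{\partial}{\partial \lambda} \log \prod_{\alpha=1}^{M} (\lambda - \lambda_\alpha) = \frac{\partial}{\partial \lambda} \log \det(\lambda 1 - C) \equiv \frac{\partial}{\partial \lambda} Z(\lambda). \tag{A.5} $$

Since we can write the determinant of a real symmetric matrix, A, as an integral,

$$ [\det A]^{-1/2} = \left( \frac{1}{\sqrt{2\pi}} \right)^{M} \int \exp\left( -\frac{1}{2} \sum_{i,j=1}^{M} \varphi_i \varphi_j A_{ij} \right) \prod_{i=1}^{M} d\varphi_i, \tag{A.6} $$

and since $C = HH^{\mathsf{T}}$ (H is an M x N matrix), we can re-express $Z(\lambda)$,

$$ Z(\lambda) = -2 \log \int \exp\left( -\frac{\lambda}{2} \sum_{i=1}^{M} \varphi_i^2 + \frac{1}{2} \sum_{i,j=1}^{M} \sum_{k=1}^{N} \varphi_i \varphi_j H_{ik} H_{jk} \right) \prod_{i=1}^{M} \frac{d\varphi_i}{\sqrt{2\pi}}. \tag{A.7} $$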

Instead of using a specific realization of H, we can compute with its ensemble average. It is not trivial that we can proceed with the ensemble average as we are implicitly assuming that in the large N limit we can substitute the average over the logarithm for the logarithm of the average. Generally, we cannot make this substitution, but in this case the result stands (in the large N limit) from the replica trick used in condensed matter physics.9 To proceed, we imagine that the elements of H are Gaussian i.i.d. noise with mean zero and variance $\sigma^2/N$. In this case, we find

$$ \left\langle \exp\left( \frac{1}{2} \sum_{i,j=1}^{M} \sum_{k=1}^{N} \varphi_i \varphi_j H_{ik} H_{jk} \right) \right\rangle = \left( 1 - \frac{\sigma^2}{N} \sum_{i=1}^{M} \varphi_i^2 \right)^{-N/2}. \tag{A.8} $$

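We can re-write the expression for the expectation using an integral representation of the delta function with $q = \sigma^2 \sum_{i=1}^{M} \varphi_i^2 / N$,

$$ \delta\left( q - \frac{\sigma^2}{N} \sum_{i=1}^{M} \varphi_i^2 \right) = \frac{1}{2\pi} \int \exp\left[ i\zeta \left( q - \frac{\sigma^2}{N} \sum_{i=1}^{M} \varphi_i^2 \right) \right] d\zeta, \tag{A.9} $$

which allows us to write,

$$ Z(\lambda) = -2 \log \frac{1}{2\pi} \iiint \exp\left( -\frac{\lambda}{2} \sum_{i=1}^{M} \varphi_i^2 \right) (1 - q)^{-N/2} \exp\left[ i\zeta \left( q - \frac{\sigma^2}{N} \sum_{i=1}^{M} \varphi_i^2 \right) \right] \prod_{i=1}^{M} \frac{d\varphi_i}{\sqrt{2\pi}}\, dq\, d\zeta. \tag{A.10} $$

Making the substitution $z = -2i\zeta/N$ and performing the Gaussian integrals over the $\varphi_i$, we can recast $Z(\lambda)$ as,

$$ Z(\lambda) = -2 \log \frac{Ni}{4\pi} \int_{-i\infty}^{i\infty} \int_{-\infty}^{\infty} \exp\left[ -\frac{M}{2} \left( \log(\lambda - \sigma^2 z) + Q \log(1 - q) + Q q z \right) \right] dq\, dz \tag{A.11} $$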
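where Q = N/M. Using the saddle point method (also known as Laplace's Method, see [16]),

$$ \int_a^b \exp\left( M f(x) \right) dx \approx \sqrt{\frac{2\pi}{M |f''(x_0)|}}\, \exp\left( M f(x_0) \right) \qquad (M \to \infty), \tag{A.12} $$

where $x_0$ is the saddle point, we can find that

$$ Q q = \frac{\sigma^2}{\lambda - \sigma^2 z}, \qquad z = \frac{1}{1 - q}, \tag{A.13} $$

which has the solution

$$ q(\lambda) = \frac{\sigma^2 (1 - Q) + Q\lambda \pm \sqrt{\left( \sigma^2 (1 - Q) + Q\lambda \right)^2 - 4\sigma^2 Q \lambda}}{2 Q \lambda}. \tag{A.14} $$

We can now readily find $G(\lambda)$ by differentiating $Z(\lambda)$, which yields

$$ \mathrm{Tr}\, G(\lambda) = \frac{\partial Z(\lambda)}{\partial \lambda} = \frac{M}{\lambda - \sigma^2 z(\lambda)} = \frac{M Q\, q(\lambda)}{\sigma^2}. \tag{A.15} $$

9 The general idea is to replicate the system by using m products of the system. Once the replicated system is averaged over the m products, the limit m -> 0 is taken to reveal the result.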

Using the imaginary part of $G(\lambda)$, we find that eq.(A.1), the density of the eigenvalues, becomes,

$$ \rho(\lambda) = \frac{\sqrt{4\sigma^2 Q \lambda - \left( \sigma^2 (1 - Q) + Q\lambda \right)^2}}{2\pi \lambda \sigma^2}. \tag{A.16} $$

Note that eq.(A.16) has a maximum and a minimum in the spectrum, namely,

$$ \lambda^{\max}_{\min} = \sigma^2 \left( 1 + \frac{1}{Q} \pm 2\sqrt{\frac{1}{Q}} \right). \tag{A.17} $$

Thus, random matrix theory predicts that we should see all the eigenvalues contained within the range $[\lambda_{\min}, \lambda_{\max}]$ and distributed according to the spectrum given in eq.(A.16). Figure A.1 shows the spectra, $\rho(\lambda)$, for Q = 1, 2, 5.

Figure A.1: Eigenvalue spectra for infinite random matrices with Q = 1, Q = 2, Q = 5. Note the presence of $\lambda_{\min}$ and $\lambda_{\max}$ in each case.
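
As a numerical sanity check on the density of eq.(A.16), the sketch below evaluates it between the two band edges and integrates it with a simple midpoint rule. The function names and the number of integration steps are illustrative choices. For Q of at least one, the integral should be close to one, since every eigenvalue lies inside the band.

```python
import math


def spectrum_edges(q, sigma2=1.0):
    """Band edges of eq.(A.17)."""
    spread = 2.0 * math.sqrt(1.0 / q)
    return sigma2 * (1.0 + 1.0 / q - spread), sigma2 * (1.0 + 1.0 / q + spread)


def spectrum_density(lam, q, sigma2=1.0):
    """Eigenvalue density of eq.(A.16); zero outside the band."""
    disc = 4.0 * sigma2 * q * lam - (sigma2 * (1.0 - q) + q * lam) ** 2
    if lam <= 0.0 or disc <= 0.0:
        return 0.0
    return math.sqrt(disc) / (2.0 * math.pi * lam * sigma2)


def integrate_density(q, sigma2=1.0, steps=50000):
    """Midpoint-rule integral of the density between the band edges."""
    lam_min, lam_max = spectrum_edges(q, sigma2)
    h = (lam_max - lam_min) / steps
    return h * sum(spectrum_density(lam_min + (k + 0.5) * h, q, sigma2)
                   for k in range(steps))
```

For Q = 1 the lower edge sits at zero, where the density diverges like an inverse square root, so the midpoint rule converges more slowly there than for Q = 2 or Q = 5.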

Annex B: Fleet indicator definitions


This Annex contains the definitions of the fleet indicators used in the analysis provided in this paper. The definitions listed below are taken verbatim from the PERFORMA database. Further technical information can be found in the PERFORMA database [5].

All  Failures 

Definition: All Failures are the sum of the On-A/C Failures and the Off-A/C Failures.

• On-A/C Failures: Total number of failures recorded on a CF 349 form against a piece of equipment installed on an Aircraft. Those are determined from all entries recorded on the On-A/C CF 349 maintenance forms against any valid WUC, where the equipment had to be replaced or repaired in order to return the Aircraft to a serviceable status. This includes all valid Sequence 1 and 2 line entries.

• Off-A/C Failures: Total number of failures recorded against uninstalled equipment. An Off-A/C form is defined as a CF 349 form without an Aircraft number or a CF 543 form. A failure will have a Fix = 3 or for non-serialized items, the Fix = 6 with a contractor Fixer Unit Code (3 letters) and a supplementary data of TFRO/TFIR/TFM.

Ao  -  Operational  Availability  as  %  of  time 

Definition: (Ao) Operational Availability as % of time is the proportion of observed time
that a group of Aircraft is in an operable state (not undergoing maintenance) in relation
to the total operational time available during a stated period. Operational Availability as
percentage of time is calculated using:

Ao = Up Time / (Up Time + Down Time)

Where: “Up Time” is the total actual number of calendar hours where the selected Aircraft are
not undergoing any maintenance action during the chosen period (no open CF 349) and
the Allocation Code is not “FX”. And: “Up Time + Down Time” is the total number of
calendar hours included in the selected period of the analysis. In calculating all downtimes
and uptimes, the date and time are translated to the nearest hour based on 24/7 operations.
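
The availability ratio above can be illustrated with a minimal Python sketch. The function name and the example hours are hypothetical, and the sketch assumes that up time and down time have already been rounded to whole hours as the definition describes.

```python
def operational_availability_percent(up_hours, down_hours):
    """Ao as a percentage: Up Time / (Up Time + Down Time) * 100."""
    if up_hours < 0 or down_hours < 0:
        raise ValueError("hours must be non-negative")
    total_hours = up_hours + down_hours
    if total_hours == 0:
        raise ValueError("the selected period contains no calendar hours")
    return 100.0 * up_hours / total_hours


if __name__ == "__main__":
    # Made-up example: a 30-day period (720 calendar hours) with 180 hours down.
    print(operational_availability_percent(540, 180))
```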

Corrective  Maintenance  Person-Hours  Rate 

Definition:  Total  number  of  “Maintenance  Person-Hours”  reported  on  CF  349  and  CF 
543  corrective  maintenance  forms  for  every  1000  hours  flown  by  a  specific  fleet.  This 
calculation  involves  three  defaults  when  examining  MPHRs  for  a  particular  component. 

•  Installation  Factor  (IF):  Quantity  of  the  same  item  that  is  installed  on  a  single  Aircraft 
(e.g.  there  are  two  engines  on  the  Aircraft).  The  Installation  Factor  information  is 
not  available  so  1  is  used  as  default. 

• Fitment Factor (FF): Proportion of a fleet onto which equipment is fitted (e.g. EW
equipment is not installed on all Aircraft). The FF information is not available so 1
is used as default (for 100% of fleet).

•  Duty  Cycle  (DC):  Proportion  of  time  a  piece  of  equipment  is  on  when  an  Aircraft  is 
operating  (e.g.  even  when  installed,  EW  equipment  does  not  operate  for  the  entire 
duration  of  a  flight).  The  DC  information  is  not  available  so  1  is  used  as  default  (for 
100%  of  mission  time). 
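
A rate "for every 1000 hours flown" with the three defaults can be sketched in Python as follows. The PERFORMA definitions quoted here do not state how IF, FF and DC enter the calculation, so the sketch assumes that they scale fleet flying hours into equipment operating hours; that scaling, the function name and the example figures are all assumptions. With every factor left at its default of 1, the result reduces to person-hours per 1000 flying hours.

```python
def rate_per_1000_flying_hours(quantity, flying_hours,
                               installation_factor=1.0,
                               fitment_factor=1.0,
                               duty_cycle=1.0):
    """Quantity (e.g. maintenance person-hours) per 1000 hours flown.

    Assumption: IF, FF and DC multiply the flying hours to approximate the
    operating hours of the equipment; all three default to 1.
    """
    operating_hours = (flying_hours * installation_factor
                       * fitment_factor * duty_cycle)
    if operating_hours <= 0:
        raise ValueError("operating hours must be positive")
    return 1000.0 * quantity / operating_hours


if __name__ == "__main__":
    # Made-up example: 2500 person-hours reported over 5000 flying hours.
    print(rate_per_1000_flying_hours(2500, 5000))
```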

First  Level  Availability 

Definition:  First  Level  Availability  (First  Level  Ao)  is  the  proportion  of  observed  time 
where  routine  maintenance  is  not  carried  out  on  the  group  of  “First  Level  Aircraft”  (First 
Level  Up  Time),  in  relation  to  the  total  cumulative  time  where  those  Aircraft  could  have 
been  available  (First  Level  Total  Time).  The  First  Level  Availability  is  based  on  the  time 
that  an  aircraft  is  considered  to  be  in  First  Level  and  not  on  calendar  time.  Therefore, 
an  aircraft  may  be  in  First  Level  for  only  two  days  in  one  month  and  have  First  Level 
Availability  of  80%  for  that  month  if  it  was  available  for  80%  of  the  time  that  it  was  in 
First  Level.  First  Level  Availability  is  an  availability  calculation  done  specifically  for  the 
group  of  “First  Level  Aircraft”  which  are  those  that  are  considered  to  be  used  for  the  daily 
flying;  they  are  owned  by  military  units,  have  an  allocation  code  “CX”  or  “GX”  and  can 
either  be  serviceable  or  be  undergoing  “First  Level  maintenance”,  generally  1st  level  of 
maintenance. First Level Availability is calculated using:

First Level Ao = (First Level Uptime) / (First Level Total Time)

Where: The “First Level Total Time” is calculated using:

First Level Total Time = (First Level Uptime + First Level Downtime)

Note  that  the  First  Level  Total  Time  is  not  necessarily  the  complete  calendar  time  for  the 
query  expression  but  the  calendar  time  during  which  an  aircraft  was  considered  to  be  in 
first  level.  The  downtimes  excluded  from  the  “First  Level  Total  Time”  calculation  are  the 
downtimes  for  a  distinct  tail  number  where  the  CF  349s  reporting  On-A/C  maintenance 
work  are  from  one  of  the  following  categories: 

•  “Non-routine  maintenance”  action  (see  list  below); 

• “Routine maintenance” (see list below) occurring simultaneously with a non-routine
maintenance action (i.e. put u/s date of the “routine maintenance” is during a
“non-routine maintenance” form downtime);

•  Maintenance  action  reported  by  2nd  or  3rd  line  (i.e.  How  Found  =  D);  and 

• Maintenance action reported by a non-military fixer unit (i.e. alphanumerical fixer
unit).

The “First Level Downtime” is calculated from the downing events for a distinct tail number
where the CF 349s reporting On-A/C maintenance work are not from the four categories
listed above. The downtimes for all these downing events are calculated for each distinct
tail number and added up to get the total “First Level Downtime”. A downing event downtime
is composed of a single or a group of CF 349s reporting work performed On-A/C
(i.e. CF 349s must have a tail number) from the time the Aircraft was first put u/s to the
completion of the maintenance work that brings the Aircraft to a serviceable status. The
downtime calculation for any downing event starts when a CF 349 form is opened against a
distinct tail number (put u/s date-time when the Aircraft becomes unserviceable) and ends
when the last CF 349 is closed (last certified serviceable date-time bringing the Aircraft
back to a serviceable status). The “First Level Up Time” is defined as any period where a
First Level Aircraft is not undergoing maintenance.
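
The grouping of forms into downing events and the resulting First Level Availability can be sketched in Python as below. This is an illustration under stated assumptions rather than the PERFORMA implementation: each CF 349 is represented by a hypothetical (opened, closed) pair of timestamps for one tail number, forms from the excluded categories are assumed to have been filtered out already, and a form opened exactly when another closes is treated as part of the same downing event.

```python
from datetime import datetime


def merge_downing_events(forms):
    """Group overlapping (opened, closed) form intervals into downing events."""
    events = []
    for opened, closed in sorted(forms):
        if events and opened <= events[-1][1]:
            # Form opened while the aircraft was already u/s: same event,
            # which ends when the last form is closed.
            events[-1][1] = max(events[-1][1], closed)
        else:
            events.append([opened, closed])
    return [(start, end) for start, end in events]


def downtime_hours(forms):
    """Total downtime in hours, summed over the merged downing events."""
    return sum((end - start).total_seconds() / 3600.0
               for start, end in merge_downing_events(forms))


def first_level_availability(forms, first_level_total_hours):
    """First Level Ao = First Level Uptime / First Level Total Time."""
    if first_level_total_hours <= 0:
        raise ValueError("first_level_total_hours must be positive")
    uptime = first_level_total_hours - downtime_hours(forms)
    return uptime / first_level_total_hours


if __name__ == "__main__":
    # Made-up forms for a single tail number.
    forms = [
        (datetime(2010, 3, 1, 8), datetime(2010, 3, 1, 20)),
        (datetime(2010, 3, 1, 12), datetime(2010, 3, 2, 8)),
        (datetime(2010, 3, 5, 0), datetime(2010, 3, 5, 12)),
    ]
    print(merge_downing_events(forms))
    print(first_level_availability(forms, 180.0))
```

Merging the intervals before summing avoids counting the same hours twice when several forms are open against the aircraft at once.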

Flying  Hours 

Definition: Total flying hours recorded by the aircrew during a given time period as
reported via the monthly AUSR report.

Mean  Flying  Time  Between  On-A/C  Corrective  Forms 

Definition: Average elapsed flying time between two consecutive On-A/C Corrective Forms.
This is determined by dividing the total operating hours of a piece of equipment over a
given period by the total number of On-A/C Corrective Forms recorded against that
equipment. For periods with no forms or events occurring, the operating hours will be shown.
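
A minimal Python sketch of this ratio, including the stated fallback for periods with no forms, is given below. The function name and the example values are hypothetical.

```python
def mean_flying_time_between_forms(operating_hours, form_count):
    """Operating hours divided by the number of forms in the period.

    When no forms were recorded, the operating hours themselves are
    returned, as the definition specifies.
    """
    if operating_hours < 0 or form_count < 0:
        raise ValueError("inputs must be non-negative")
    if form_count == 0:
        return float(operating_hours)
    return operating_hours / form_count


if __name__ == "__main__":
    # Made-up example: 1200 operating hours and 8 forms in the period.
    print(mean_flying_time_between_forms(1200, 8))
    print(mean_flying_time_between_forms(1200, 0))
```

Because of the fallback, a period with no forms reports the same value as a period with exactly one form, which is worth keeping in mind when these indicators are correlated against other time series.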

Mean  Flying  Time  Between  On-A/C  Preventive  Forms 

Definition: Average elapsed flying time between two consecutive On-A/C Preventive Forms.
This is determined by dividing the total operating hours of a piece of equipment over a
given period by the total number of On-A/C Preventive Forms recorded against that
equipment. For periods with no forms or events occurring, the operating hours will be shown.
This parameter is more suitable for analysis at the system or component level.

Mean  Flying  Time  Between  Downing  Events 

Definition: MFTBDE indicates the average flying hours between two consecutive Aircraft
Downing Events. A downing event refers to any single occurrence, or group of occurrences,
where an Aircraft is brought from a Serviceable/Operational status to an
Unserviceable/Repair status. These include both Preventive and Corrective Maintenance Actions
reported against an operational Aircraft. Only forms with a numerical fixer unit are
included in an event. A downing event may include several failures that are all repaired
following the single downing event.

Off-A/C Maintenance Person-Hours Rate

Definition: Total number of “Maintenance Person-Hours” reported on “Off-A/C” forms for
every 1000 hours flown by a specific fleet or selected Aircraft.

This calculation involves three defaults when examining MPHRs for a particular component.

•  Installation  Factor  (IF):  Quantity  of  the  same  item  that  is  installed  on  a  single  Aircraft 
(e.g.  there  are  two  engines  on  the  Aircraft).  The  Installation  Factor  information  is 
not  available  so  1  is  used  as  default. 

•  Fitment  Factor  (FF):  Proportion  of  a  fleet  onto  which  equipment  is  fitted  (e.g.  EW 
equipment  is  not  installed  on  all  Aircraft).  The  FF  information  is  not  available  so  1 
is  used  as  default  (for  100%  of  fleet). 

•  Duty  Cycle  (DC):  Proportion  of  time  a  piece  of  equipment  is  on  when  an  Aircraft  is 
operating  (e.g.  even  when  installed,  EW  equipment  does  not  operate  for  the  entire 
duration  of  a  flight).  The  DC  information  is  not  available  so  1  is  used  as  default  (for 
100%  of  mission  time). 

On-A/C Maintenance Person-Hours Rate

Definition: Total number of “Maintenance Person-Hours” reported on “On-A/C” forms for
every 1000 hours flown by a specific fleet or selected Aircraft.

This calculation involves three defaults when examining MPHRs for a particular component.


•  Installation  Factor  (IF):  Quantity  of  the  same  item  that  is  installed  on  a  single  Aircraft 
(e.g.  there  are  two  engines  on  the  Aircraft).  The  Installation  Factor  information  is 
not  available  so  1  is  used  as  default. 

•  Fitment  Factor  (FF):  Proportion  of  a  fleet  onto  which  equipment  is  fitted  (e.g.  EW 
equipment  is  not  installed  on  all  Aircraft).  The  FF  information  is  not  available  so  1 
is  used  as  default  (for  100%  of  fleet). 

•  Duty  Cycle  (DC):  Proportion  of  time  a  piece  of  equipment  is  on  when  an  Aircraft  is 
operating  (e.g.  even  when  installed,  EW  equipment  does  not  operate  for  the  entire 
duration  of  a  flight).  The  DC  information  is  not  available  so  1  is  used  as  default  (for 
100%  of  mission  time). 

On  Aircraft  Robs  Maintenance  Person-Hour  Rate 

Definition: Number of “Maintenance Person-Hours” reported against a ROB on “On-A/C
forms” for every 1000 hours flown by a specific fleet or selected Aircraft. This calculation
involves three defaults when examining MPHRs for a particular component.

•  Installation  Factor  (IF):  Quantity  of  the  same  item  that  is  installed  on  a  single  Aircraft 
(e.g.  there  are  two  engines  on  the  Aircraft).  The  Installation  Factor  information  is 
not  available  so  1  is  used  as  default. 

• Fitment Factor (FF): Proportion of a fleet onto which equipment is fitted (e.g. EW
equipment is not installed on all Aircraft). The FF information is not available so 1
is used as default (for 100% of fleet).

•  Duty  Cycle  (DC):  Proportion  of  time  a  piece  of  equipment  is  on  when  an  Aircraft  is 
operating  (e.g.  even  when  installed,  EW  equipment  does  not  operate  for  the  entire 
duration  of  a  flight).  The  DC  information  is  not  available  so  1  is  used  as  default  (for 
100%  of  mission  time). 

Ops  Mission  Aborts  Rate 

Definition:  Total  number  of  “Ops  Mission  Aborts”  reported  for  every  1000  hours  flown  by 
a  specific  fleet. 

Rate  calculations  involve  three  defaults. 

•  Installation  Factor  (IF):  Quantity  of  the  same  item  that  is  installed  on  a  single  Aircraft 
(e.g.  there  are  two  engines  on  the  Aircraft).  The  Installation  Factor  information  is 
not  available  so  1  is  used  as  default. 

•  Fitment  Factor  (FF):  Proportion  of  a  fleet  onto  which  equipment  is  fitted  (e.g.  EW 
equipment  is  not  installed  on  all  Aircraft).  The  FF  information  is  not  available  so  1 
is  used  as  default  (for  100%  of  fleet). 

•  Duty  Cycle  (DC):  Proportion  of  time  a  piece  of  equipment  is  on  when  an  Aircraft  is 
operating  (e.g.  even  when  installed,  EW  equipment  does  not  operate  for  the  entire 
duration  of  a  flight).  The  DC  information  is  not  available  so  1  is  used  as  default  (for 
100%  of  mission  time). 

Preventive  Maintenance  Person-Hours  Rate 

Definition:  Total  number  of  “Maintenance  Person-Hours”  reported  on  CF  349  and  CF 
543  preventive  maintenance  forms  for  every  1000  hours  flown  by  a  specific  fleet.  This 
calculation  involves  three  defaults  when  examining  MPHRs  for  a  particular  component. 

•  Installation  Factor  (IF):  Quantity  of  the  same  item  that  is  installed  on  a  single  Aircraft 
(e.g.  there  are  two  engines  on  the  Aircraft).  The  Installation  Factor  information  is 
not  available  so  1  is  used  as  default. 

•  Fitment  Factor  (FF):  Proportion  of  a  fleet  onto  which  equipment  is  fitted  (e.g.  EW 
equipment  is  not  installed  on  all  Aircraft).  The  FF  information  is  not  available  so  1 
is  used  as  default  (for  100%  of  fleet). 

•  Duty  Cycle  (DC):  Proportion  of  time  a  piece  of  equipment  is  on  when  an  Aircraft  is 
operating  (e.g.  even  when  installed,  EW  equipment  does  not  operate  for  the  entire 
duration  of  a  flight).  The  DC  information  is  not  available  so  1  is  used  as  default  (for 
100%  of  mission  time). 

List of Acronyms


ADM(Mat)  Assistant  Deputy  Minister  (Materiel) 